Abstract

We report the isolation and characterization of a novel bat coronavirus which is much
closer to the SARS coronavirus (SARS-CoV) in genomic sequence than others
previously reported, particularly in the S gene. Cell entry and susceptibility studies
indicated that this virus can use ACE2 as a receptor and infect animal and human cell
lines. Our results provide further evidence of the bat origin of SARS-CoV and
highlight the likelihood of future bat coronavirus emergence in humans.
Text
The 2002-3 outbreak caused by severe acute respiratory syndrome coronavirus (SARS-CoV)
was a significant public health threat at the beginning of the twenty-first century (1).
Initial evidence pointed to the masked palm civet (Paguma larvata) as the
primary suspect for the animal origin of SARS-CoV (2, 3). Later studies suggested that
Chinese horseshoe bats are the natural reservoirs and that masked palm civets most likely
served as an intermediate amplification host for SARS-CoV (4, 5). From our
longitudinal surveillance of bat SARS-like coronavirus (SL-CoV) in a single bat
colony of the species Rhinolophus sinicus in Kunming, Yunnan Province, China, we
found a high prevalence of diverse SL-CoVs (6). Whole genome sequence
comparison revealed that these SL-CoVs have 78%-95% nucleotide (nt) sequence
identity to SARS-CoV, with the major differences located in the spike protein (S)
genes and the ORF8 region. Significantly, we have recently isolated a bat SL-CoV
(WIV1) and constructed an infectious clone of another strain (SHC014), both of which are
closely related to SARS-CoV and capable of using the same cellular receptor
(angiotensin-converting enzyme 2, ACE2) as SARS-CoV (6, 7). Despite the high
similarity in genomic sequence and receptor usage of these two strains, there are still
some differences in the N-terminal domain of the S protein between SARS-CoV and
other SL-CoVs, indicating that viruses more similar to SARS-CoV are circulating in bats.
Here we report the isolation of a new SL-CoV strain, named bat SL-CoV
WIV16. SL-CoV WIV16 was isolated from a single fecal sample of Rhinolophus
sinicus which was collected in Kunming, Yunnan Province, in July 2013. The full
genomic sequence of SL-CoV WIV16 (GenBank number: KT444582) was
determined; it is 30,290 nt in size with a poly(A) tail, which is slightly
larger than those of SARS-CoVs and other bat SL-CoVs (6, 8-13). The WIV16
genome has a 40.9% G+C content and short untranslated regions (UTRs) of 264 and
339 nt at the 5’ and 3’ termini, respectively. Its gene organization is identical to
that of WIV1 and slightly different from those of the civet SARS-CoV and other bat
SL-CoVs due to an additional ORF (named ORFx) detected between the ORF6 and ORF7
genes of the WIV1 and WIV16 genomes (data not shown). The conserved transcriptional
regulatory sequence was identified upstream of ORFx, indicating that this is likely to
be a functional gene. The overall nt sequence of WIV16 shared 96% identity with
human and civet SARS-CoVs, higher than that of any previously reported bat
SL-CoV (Table 1) (4-6, 8-13). A detailed comparison of protein sequences
between SARS-CoV GZ02, a strain from an early phase patient, and all reported
bat SL-CoVs indicated that WIV16 is the closest progenitor of SARS-CoV in
most proteins, particularly in the S protein (Table 1).

The S protein is responsible for virus entry and is functionally divided into two
domains, denoted S1 and S2. The S1 domain is involved in receptor binding and the
S2 domain in cellular membrane fusion (14). S1 is functionally subdivided into two
domains, an N-terminal domain (S1-NTD) and a C-terminal domain (S1-CTD), both of which
can bind to host receptors and hence function as receptor-binding domains (RBDs)
(15). All isolates of SARS-CoV and SL-CoV share high identity in both nt and amino
acid (aa) sequences in the S2 region but are highly diverse in their S1 regions. The WIV16 S
gene shared 95% sequence identity at the nt level and 97% at the aa level with
SARS-CoVs, much higher than that of WIV1, with 88% at the nt level and 90% at the aa
level. In contrast to other bat SL-CoVs, WIV16 has an S1-NTD that is much more
similar to that of SARS-CoV (Fig. 1). The S1-NTD of WIV16 shared aa sequence
identity of 94% with SARS-CoVs, but only 50%-75% with other bat SL-CoVs. It is
worth noting that the WIV16 RBD (aa 318-510) shared 95% sequence identity with
that of SARS-CoV, but is almost identical to that of WIV1. Thus the WIV16 S gene is
likely a recombinant of WIV1 and a recent ancestor of SARS-CoV.
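
The identity percentages quoted above are pairwise comparisons of aligned sequences. As a minimal sketch of the arithmetic, and not the authors' actual pipeline (the alignment tool and the treatment of gaps are not stated in the text), the following Python function computes percent identity over the ungapped columns of two pre-aligned sequences. The function name `percent_identity`, the gap convention and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real viral sequence).
toy_a = "ATGTTTATTTTCTTATTATT"
toy_b = "ATGTTTATCTTCTTA-TATT"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

Other conventions, such as counting gaps as mismatches, give slightly different percentages, so the convention should be stated whenever identities from different sources are compared.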

High sequence conservation of the WIV16 RBD with that of SARS-CoVs
predicts that WIV16 is likely to also use ACE2 as a cellular entry receptor. This was
confirmed by infection of HeLa cells expressing ACE2 from human, civet and
Chinese horseshoe bat, respectively (Fig. 2A). Cell susceptibility tests using different
cell lines further indicated that WIV16 has the same host range as WIV1 (Fig. 2B)
(6).

To assess whether the major sequence difference in the S1-NTD has an
effect on virus entry and/or replication, the growth kinetics of the two viruses were
compared. Vero E6 cells were infected with WIV1 or WIV16 at an MOI of 1
and virus production in the medium supernatant was determined at four time points
post infection by quantification of viral RNA (Fig. 3, see figure legend for more
technical detail). The two viruses grew at very similar rates, with WIV16 slightly
slower than WIV1 during the 48-h period examined in this study. It is hard to
conclude whether this subtle difference is significant and related to the S1-NTD
sequence difference. Further investigation with more cell lines is required to confirm
this preliminary observation.

In conclusion, we isolated and characterized a novel bat SL-CoV isolate, WIV16,
which is the closest ancestor to date of SARS-CoV. Our results provide further
evidence that Chinese horseshoe bats are natural reservoirs of SARS-CoVs. It should
be noted that WIV16 is not the closest strain to the human SARS-CoVs with
regard to ORF8. A full-length ORF8 is present in several SARS-CoV genomes from
early phase patients, all civet SARS-CoVs and bat SL-CoVs. It is split into two ORFs
(ORF8a and ORF8b) in most human SARS-CoVs from late phase patients due to a
deletion event in this part of the genome (3). Two recent papers reported
full-length ORF8 sequences which share higher similarity with those of SARS-CoV GZ02 and
civet SARS-CoV SZ3, suggesting that SARS-CoV derived from complicated
recombination and genetic evolution among different bat SL-CoVs (10, 12). Taken
together, we predict that there are diverse SL-CoVs still to be discovered in bats.
Continued surveillance of this group of viruses in bats will be necessary and
important not only for a better understanding of the spillover mechanism, but also for more
effective risk assessment and prevention of future SARS-like disease outbreaks.
Figure legends


FIG 1 Similarity plot based on the nucleotide sequence of the S gene of bat SL-CoV
WIV16. S genes of human/civet SARS-CoVs and bat SL-CoV WIV1 were used as
reference sequences, with a window of 200 bp and a step size of 20 bp, under the Kimura
model.
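
A plot of this kind can be sketched as follows. This is an illustrative Python outline, not the software used for the figure: it assumes the sequences are already aligned, takes "Kimura model" to mean the Kimura two-parameter distance, and reports similarity as 1 minus that distance. The function names and the toy sequences are hypothetical.

```python
import math

# Purine <-> purine and pyrimidine <-> pyrimidine changes are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA segments."""
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        return math.nan
    p = transitions / sites      # proportion of transitions
    q = transversions / sites    # proportion of transversions
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        return math.nan  # too divergent for the correction to be defined
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


def similarity_scan(query: str, reference: str, window: int = 200, step: int = 20):
    """Return (window midpoint, similarity) pairs along an aligned pair."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        d = k2p_distance(query[start:start + window],
                         reference[start:start + window])
        points.append((start + window // 2, 1.0 - d))
    return points


# Toy data (hypothetical): a 400-nt reference and a query with one transition.
reference = "ACGT" * 100
query = "G" + reference[1:]
points = similarity_scan(query, reference)
for midpoint, similarity in points[:3]:
    print(midpoint, round(similarity, 4))
```

Running the scan once per reference sequence and plotting the resulting curves against alignment position gives the kind of comparison described in the legend.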


FIG 2 Receptor analysis (A) and susceptibility test (B) of bat SL-CoV WIV16. 

A, HeLa cells with and without the expression of ACE2. ACE2 expression was 
detected with goat anti-human ACE2 antibody followed by fluorescein isothiocyanate 
(FITC)-conjugated donkey anti-goat IgG. Virus replication was detected with rabbit 
antibody against the SL-CoV Rp3 nucleocapsid protein followed by cyanine 3 
(Cy3)-conjugated mouse anti-rabbit IgG. Nuclei were stained with DAPI
(4’,6-diamidino-2-phenylindole). The columns (from left to right) show staining of nuclei 
(blue), ACE2 expression (green), virus replication (red) and the merged triple-stained 
images. b, bat; c, civet; h, human. 

B, Virus infection in A549, LLC-MK2, RSKT, PK15, H292 and Vero-E6 cells. 

The columns (from left to right) show staining of nuclei (blue), virus replication (red),
and the merged double-stained images. A549 and H292, human lung cells; LLC-MK2,
macaque kidney cells; RSKT, Chinese horseshoe bat kidney cells; PK15, pig kidney
cells; Vero-E6, African green monkey kidney cells.

FIG 3 One-step growth curve of bat SL-CoV WIV16 compared with WIV1.

Vero E6 cells were infected with WIV16 or WIV1 at an MOI of 1. Supernatants were
collected at 0, 12, 24 and 48 h post infection. Viral RNA in the supernatants was
quantified by one-step real-time reverse transcription PCR (n=3); viral RNA extracted from
virus of known titer was used to set up the standard curve. Error bars represent
standard deviation.
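
The standard-curve quantification described in this legend can be sketched as below. This is a hedged illustration only: it assumes a linear relationship between the threshold cycle (Ct) and log10 of titer, fitted by ordinary least squares, and every number, unit and function name in it is hypothetical rather than taken from the study.

```python
import math
import statistics


def fit_standard_curve(titers, ct_values):
    """Least-squares fit of Ct = slope * log10(titer) + intercept."""
    xs = [math.log10(t) for t in titers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ct_values) / len(ct_values)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ct_to_titer(ct, slope, intercept):
    """Invert the standard curve to estimate a titer-equivalent quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of RNA from a virus stock of known titer.
standard_titers = [1e2, 1e3, 1e4, 1e5, 1e6]      # arbitrary titer units
standard_cts = [33.0, 29.7, 26.4, 23.1, 19.8]    # made-up Ct values
slope, intercept = fit_standard_curve(standard_titers, standard_cts)

# Hypothetical triplicate Ct readings for one supernatant sample (n=3).
replicate_cts = [25.0, 25.2, 24.9]
estimates = [ct_to_titer(ct, slope, intercept) for ct in replicate_cts]
mean_estimate = statistics.mean(estimates)
sd_estimate = statistics.stdev(estimates)  # sample standard deviation
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"mean={mean_estimate:.3g}, sd={sd_estimate:.3g}")
```

Repeating the conversion for each virus and time point would give the mean and standard deviation values plotted in a growth curve of this type.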

Table 1. Genomic comparison of SARS-CoV GZ02 with civet SARS-CoV and bat SL-CoVs.

FL or ORFs  No. of nt  No. of aa  Identity nt/aa (%)
                                  SZ3          WIV16        WIV1         Rs3367       RsSHC014     Rs672        Rp3          Rf1          Rm1          LYRa11       HKU3-1       YNLF_31C     BM48-31
FL          -          -          99.8         96.0         95.6         95.7         95.4         93.4         92.6         87.8         88.2         90.9         87.9         93.5         78.8
P1a         13,134     4,377      99.9/99.9    96.6/98.1    96.9/98.0    96.9/98.0    96.8/98.1    96.4/98.1    94.9/96.6    88.0/94.3    88.0/93.6    91.0/95.9    88.2/94.3    96.0/97.3    76.9/81.6
P1b         8,088      2,695      99.9/99.9    96.1/99.1    96.3/99.4    96.3/99.4    96.4/99.5    96.0/99.3    96.2/99.1    90.9/98.3    91.4/98.6    93.8/98.9    90.9/98.6    96.8/99.2    85.5/96.0
S           3,768      1,255      99.6/99.0    95.4/97.3    90.2/92.4    90.2/92.5    88.4/90.2    77.6/80.1    78.1/80.2    75.5/78.4    78.0/80.6    83.3/89.9    77.0/79.4    76.1/79.2    70.9/76.0
(S1)*       2,040      680        99.5/98.8    92.6/95.4    83.3/86.5    83.4/86.8    79.9/82.4    68.8/67.0    69.1/66.7    66.7/66.1    69.0/67.4    80.3/84.4    69.2/67.2    67.5/66.7    65.8/64.5
(S2)*       1,728      575        99.8/99.3    98.3/99.5    98.3/99.5    98.2/99.3    98.3/99.5    88.0/95.5    88.4/96.2    85.5/92.7    88.3/96.0    87.3/96.3    85.9/93.9    86.0/93.7    76.7/89.6
ORF3a       825        274        99.0/97.8    99.2/98.2    99.0/97.8    99.2/98.2    99.3/98.2    90.4/90.8    84.0/84.3    88.6/86.9    83.5/84.3    89.7/91.6    83.0/82.5    89.0/88.3    [illegible]
E           231        76         100.0/100.0  99.1/100.0   99.1/100.0   99.1/100.0   98.7/98.7    99.6/100.0   97.8/100.0   96.5/96.1    96.1/98.7    98.3/98.7    97.4/100.0   99.6/100.0   90.0/92.1
M           666        221        99.8/99.5    97.4/98.2    97.4/98.2    97.4/98.2    97.4/97.7    97.7/98.6    93.4/97.3    95.5/97.7    94.7/97.3    94.7/97.7    95.0/98.6    95.9/98.6    81.5/91.4
ORF6        192        63         100.0/100.0  95.3/92.1    95.8/93.7    97.9/96.8    97.4/96.8    97.4/98.4    94.8/92.1    94.8/93.7    94.8/92.1    94.3/95.2    94.8/93.7    92.7/88.9    65.1/50.0
ORF7a       369        122        100.0/100.0  94.3/95.1    94.9/95.1    94.9/95.9    94.6/95.9    94.3/95.9    93.8/95.1    92.1/91.8    93.0/93.4    93.2/94.3    93.0/94.3    96.7/96.7    63.9/58.5
ORF7b       135        44         100.0/100.0  96.3/93.2    95.6/93.2    95.6/93.2    96.3/93.2    95.6/93.2    96.3/93.2    94.1/90.9    95.6/93.2    86.7/90.9    92.6/93.2    97.0/93.2    65.0/70.0
ORF8        369        122        99.5/98.4    50.1/38.6    50.7/39.5    50.7/39.5    50.7/40.4    51.6/39.5    53.3/39.5    82.1/81.8    52.1/39.5    51.0/38.3    52.1/37.7    82.1/82.6    N/A
N           1,269      422        99.9/100.0   98.4/99.5    98.4/99.8    98.7/100.0   98.3/99.5    97.6/98.6    96.7/98.1    94.2/95.7    96.4/97.9    96.9/97.9    96.2/96.7    97.2/98.3    78.5/88.2


SARS-CoV GZ02 was isolated from an early phase patient of the SARS outbreak in 2003. SARS-CoV SZ3 was identified from Paguma larvata collected in
Guangdong, China, in 2003. SL-CoV WIV16, WIV1, Rs3367 and RsSHC014 were identified from Rhinolophus sinicus collected in Yunnan, China, from 2011 to 2013. SL-CoV
YNLF_31C was identified from R. ferrumequinum collected in Yunnan, China, in 2013. SL-CoV LYRa11 was identified from R. affinis collected in Yunnan, China, in 2011.
SL-CoV Rs672, Rp3 and HKU3-1 were identified from R. sinicus collected in China (respectively: Guangxi, 2004; Guizhou, 2006; Hong Kong, 2005). Rf1 and Rm1 were
identified from R. ferrumequinum and R. macrotis, respectively, collected in Hubei, China, in 2003. Bat SARS-related CoV BM48-31 was identified from R. blasii collected
in Bulgaria in 2008. FL, full-length genome. *S1, the N-terminal domain of the S protein (aa 1-680); S2, the C-terminal domain of the S protein (aa 681-1255). The pairwise
comparison was conducted for all ORFs at the nucleotide (nt) and amino acid (aa) levels. The full-length genome was compared at the nt level. N/A, not available.